
    Referent Selection Strategies in Case-Crossover Analyses of Air Pollution Exposure Data: Implications for Bias

    The case-crossover design has been widely used to study the association between short-term air pollution exposure and the risk of an acute adverse health event. The design uses cases only, and, for each individual, compares exposure just prior to the event with exposure at other control, or “referent” times. By making within-subject comparisons, time-invariant confounders are controlled by design. Even more important in the air pollution setting is that, by matching referents to the index time, time-varying confounders can also be controlled by design. Yet, the referent selection strategy is important for reasons other than control of confounding. The case-crossover design makes the implicit assumption that there is no trend in exposure across the referent times. In addition, the statistical method that is employed, conditional logistic regression, is only unbiased with certain referent strategies. This paper reviews the case-crossover literature in the air pollution context, focusing on key referent selection issues. It concludes with a set of recommendations for choosing a referent strategy with air pollution exposure data. We advocate the time-stratified approach to referent selection because it ensures unbiased conditional logistic regression estimates, avoids bias due to time trend in the exposure series, and can be tailored to match on specific time-varying confounders.
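
A minimal sketch of time-stratified referent selection, for illustration only. The abstract does not say how the strata are defined; the sketch assumes one common choice, in which the referents for an index day are all other days in the same calendar month that fall on the same day of the week. The function name `time_stratified_referents` is hypothetical.

```python
from datetime import date, timedelta
from typing import List


def time_stratified_referents(index_day: date) -> List[date]:
    """Return referent days for one index (event) day.

    Assumption: a stratum is a calendar month crossed with a day of the
    week, so the strata are fixed in advance and do not depend on when
    the event occurred.
    """
    referents = []
    day = index_day.replace(day=1)  # first day of the index month
    while day.month == index_day.month:
        # Keep days with the same weekday, excluding the index day itself.
        if day.weekday() == index_day.weekday() and day != index_day:
            referents.append(day)
        day += timedelta(days=1)
    return referents


if __name__ == "__main__":
    for d in time_stratified_referents(date(2002, 7, 17)):
        print(d.isoformat())
```

Because the strata are set a priori, the same set of days is used whichever day in the stratum is the index day, which is the property the abstract associates with unbiased conditional logistic regression estimates.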

    Statistical Analysis of Air Pollution Panel Studies: An Illustration

    The panel study design is commonly used to evaluate the short-term health effects of air pollution. Standard statistical methods for analyzing longitudinal data are available, but the literature reveals that the techniques are not well understood by practitioners. We illustrate these methods using data from the 1999 to 2002 Seattle panel study. Marginal, conditional, and transitional approaches for modeling longitudinal data are reviewed and contrasted with respect to their parameter interpretation and methods for accounting for correlation and dealing with missing data. We also discuss and illustrate techniques for controlling for time-dependent and time-independent confounding, and for exploring and summarizing panel study data. Notes on available software are included in the appendix

    Overlap Bias in the Case-Crossover Design, With Application to Air Pollution Exposures

    The case-crossover design uses cases only, and compares exposures just prior to the event times to exposures at comparable control, or “referent” times, in order to assess the effect of short-term exposure on the risk of a rare event. It has commonly been used to study the effect of air pollution on the risk of various adverse health events. Proper selection of referents is crucial, especially with air pollution exposures, which are shared, highly seasonal, and often have a long-term time trend. Hence, careful referent selection is important to control for time-varying confounders, and in order to ensure that the distribution of exposure is constant across referent times, a key assumption of this method. Yet the referent strategy is important for a more basic reason: the conditional logistic regression estimating equations commonly used are biased when referents are not chosen a priori and are functions of the observed event times. We call this bias in the estimating equations overlap bias. In this paper, we propose a new taxonomy of referent selection strategies in order to emphasize their statistical properties. We give a derivation of overlap bias, explore its magnitude, and consider how the bias depends on properties of the exposure series. We conclude that the bias is usually small, though highly unpredictable, and easily avoided.

    Accounting for Errors from Predicting Exposures in Environmental Epidemiology and Environmental Statistics

    PLEASE NOTE THAT AN UPDATED VERSION OF THIS RESEARCH IS AVAILABLE AS WORKING PAPER 350 IN THE UNIVERSITY OF WASHINGTON BIOSTATISTICS WORKING PAPER SERIES (http://www.bepress.com/uwbiostat/paper350). In environmental epidemiology and related problems in environmental statistics, it is typically not practical to directly measure the exposure for each subject. Environmental monitoring is employed with a statistical model to assign exposures to individuals. The result is a form of exposure misspecification that can result in complicated errors in the health effect estimates if the exposure is naively treated as known. The exposure error is neither “classical” nor “Berkson”, so standard regression calibration methods do not apply. We decompose the health effect estimation error into three components. First, the standard errors are too small if the exposure field is correlated, independent of variability in estimating the exposure field parameters. Second, the standard errors are too small because they do not account for variability in estimating the exposure field parameters. Third, there is a bias from using approximate exposure field parameters in place of the unobserved true ones. We outline a three-stage correction procedure to account separately for each of these errors. A key insight is that we can account for the second part of the error (sampling variability in estimating the exposure) by averaging over simulations from the part of the posterior exposure surface that is informative for the outcome. This amounts to averaging over samples of the posterior exposure model parameters, a procedure that we call “parameter simulation”. One implication is that it is preferable to use a parametric correlation model (e.g., kriging) rather than a semi-parametric approximation. While the latter approach has been found to be effective in estimating mean exposure fields, it does not provide the needed decomposition of the posterior into informative and non-informative components. We illustrate the properties of our corrected estimators in a simulation study and present an example from environmental statistics. The focus of this paper is on linear health effect models with uncorrelated outcomes, but extensions to generalized linear models and correlated outcomes are possible.

    Influence of prediction approaches for spatially-dependent air pollution exposure on health effect estimation

    Background: Air pollution studies increasingly estimate individual-level exposures from area-based measurements by using exposure prediction methods such as nearest monitor and kriging predictions. However, little is known about the properties of these methods for health effects estimation. This simulation study explores how two common prediction approaches for fine particulate matter (PM2.5) affect relative risk estimates for cardiovascular events in a single geographic area. Methods: We estimated two sets of parameters to define correlation structures from 2002 PM2.5 data in the Los Angeles (LA) area and selected additional parameters to evaluate different correlation features. For each structure, annual average PM2.5 was generated at 22 existing monitoring sites and 2,000 pre-selected individual locations in LA. Associated survival time until cardiovascular event was simulated for 10,000 hypothetical subjects. Using PM2.5 generated at monitoring sites, we predicted PM2.5 at subject locations by nearest monitor and kriging interpolation. Finally, relative risks (RRs) of the effect of PM2.5 on time to cardiovascular event were estimated. Results: Health effect estimates for cardiovascular events had higher or similar coverage probability for kriging compared to nearest monitor exposures. The lower mean square error of nearest monitor prediction resulted from more precise but biased health effect estimates. The difference between these approaches dramatically moderated when spatial correlation increased and geographical characteristics were included in the mean model. Conclusions: When the underlying exposure distribution has a large amount of spatial dependence, both kriging and nearest monitor predictions gave good health effect estimates. For exposure with little spatial dependence, kriging exposure was preferable but gave very uncertain estimates
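
As a rough illustration of the simpler of the two prediction approaches named above, the following sketch assigns each subject the concentration measured at the closest monitor. It assumes planar coordinates and Euclidean distance, and the function name `nearest_monitor_prediction` and all values are hypothetical; it does not reproduce the study's simulation, and kriging is not shown.

```python
import math
from typing import List, Tuple

Monitor = Tuple[float, float, float]  # (x, y, measured PM2.5)
Location = Tuple[float, float]        # (x, y)


def nearest_monitor_prediction(monitors: List[Monitor],
                               subjects: List[Location]) -> List[float]:
    """Assign each subject the PM2.5 value of the nearest monitor.

    Assumption: coordinates are in a projected (planar) system, so
    straight-line Euclidean distance is a reasonable distance measure.
    """
    predictions = []
    for sx, sy in subjects:
        nearest = min(monitors,
                      key=lambda m: math.hypot(m[0] - sx, m[1] - sy))
        predictions.append(nearest[2])
    return predictions


if __name__ == "__main__":
    # Made-up monitor and subject locations, for illustration only.
    monitors = [(0.0, 0.0, 10.0), (10.0, 0.0, 20.0)]
    subjects = [(1.0, 1.0), (9.0, 2.0)]
    print(nearest_monitor_prediction(monitors, subjects))
```

Every subject closest to a given monitor receives the same value, so the predicted exposure surface is piecewise constant, whereas kriging interpolates between monitors using a fitted spatial correlation structure.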

    Efficient Measurement Error Correction with Spatially Misaligned Data

    Association studies in environmental statistics often involve exposure and outcome data that are misaligned in space. A common strategy is to employ a spatial model such as universal kriging to predict exposures at locations with outcome data and then estimate a regression parameter of interest using the predicted exposures. This results in measurement error because the predicted exposures do not correspond exactly to the true values. We characterize the measurement error by decomposing it into Berkson-like and classical-like components. One correction approach is the parametric bootstrap, which is effective but computationally intensive since it requires solving a nonlinear optimization problem for the exposure model parameters in each bootstrap sample. We propose a less computationally intensive alternative termed the “parameter bootstrap” that only requires solving one nonlinear optimization problem, and we also compare bootstrap methods to other recently proposed methods. We illustrate our methodology in simulations and with publicly available data from the Environmental Protection Agency.

    Reduced-rank spatio-temporal modeling of air pollution concentrations in the Multi-Ethnic Study of Atherosclerosis and Air Pollution

    There is growing evidence in the epidemiologic literature of the relationship between air pollution and adverse health outcomes. Prediction of individual air pollution exposure in the Environmental Protection Agency (EPA) funded Multi-Ethnic Study of Atherosclerosis and Air Pollution (MESA Air) study relies on a flexible spatio-temporal prediction model that integrates land-use regression with kriging to account for spatial dependence in pollutant concentrations. Temporal variability is captured using temporal trends estimated via modified singular value decomposition and temporally varying spatial residuals. This model utilizes monitoring data from existing regulatory networks and supplementary MESA Air monitoring data to predict concentrations for individual cohort members. In general, spatio-temporal models are limited in their efficacy for large data sets due to computational intractability. We develop reduced-rank versions of the MESA Air spatio-temporal model. To do so, we apply low-rank kriging to account for spatial variation in the mean process and discuss the limitations of this approach. As an alternative, we represent spatial variation using thin plate regression splines (TPRS). We compare the performance of the outlined models using EPA and MESA Air monitoring data for predicting concentrations of oxides of nitrogen (NO$_x$), a pollutant of primary interest in MESA Air, in the Los Angeles metropolitan area via cross-validated $R^2$. Our findings suggest that use of reduced-rank models can improve computational efficiency in certain cases. Low-rank kriging and thin plate regression splines were competitive across the formulations considered, although TPRS appeared to be more robust in some settings. Comment: Published in the Annals of Applied Statistics (http://www.imstat.org/aoas/) at http://dx.doi.org/10.1214/14-AOAS786 by the Institute of Mathematical Statistics (http://www.imstat.org).
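
The abstract compares models by cross-validated R-squared without stating the formula. The sketch below assumes one common definition, one minus the ratio of the squared prediction error to the total sum of squares of the held-out observations, computed from out-of-sample predictions; the paper's exact definition may differ. The function name `cross_validated_r2` is hypothetical.

```python
from typing import Sequence


def cross_validated_r2(observed: Sequence[float],
                       predicted: Sequence[float]) -> float:
    """R-squared computed from held-out observations and their predictions.

    Assumption: `predicted[i]` was produced by a model fitted without
    `observed[i]` (for example, in a leave-one-site-out scheme).
    """
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values have zero variance")
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    obs = [1.0, 2.0, 3.0, 4.0]
    print(cross_validated_r2(obs, obs))        # perfect predictions
    print(cross_validated_r2(obs, [2.5] * 4))  # predicting the mean
```

Under this definition the value can be negative when the predictions are worse than simply predicting the mean of the held-out data.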

    Issues Related to Combining Multiple Speciated PM2.5 Data Sources in Spatio-Temporal Exposure Models for Epidemiology: The NPACT Case Study

    Background: Regulatory monitoring data have been the most common exposure data resource in studies of the association between long-term PM2.5 components and health. However, data collected for regulatory purposes may not be compatible with epidemiological study. Objectives: We aimed to explore three important features of the PM2.5 component monitoring data obtained from multiple sources to combine all available data for developing spatio-temporal prediction models in the National Particle Component and Toxicity (NPACT) study. Methods: The NPACT monitoring data were collected in an extensive monitoring campaign targeting cohort participants. The regulatory monitoring data were obtained from the Chemical Speciation Network (CSN) and the Interagency Monitoring of Protected Visual Environments (IMPROVE). We performed exploratory analyses to examine three features that could affect our approach to combining data: comprehensiveness of spatial coverage, comparability of analysis methods, and consistency in sampling protocols. In addition, we considered the viability of developing a spatio-temporal prediction model given: 1) all available data; 2) NPACT data only; and 3) NPACT data with temporal trends estimated from other pollutants. Results: The number of CSN/IMPROVE monitors was limited in all study areas. The different laboratory analysis methods and the protocol differences for sampling resulted in incompatible measurements between networks. Given these features, we determined that it was preferable to develop our spatio-temporal model using only the NPACT data and under simplifying assumptions. Conclusions: Investigators conducting epidemiological studies of long-term PM2.5 components need to be mindful of the features of the monitoring data and incorporate this understanding into exposure model development

    Risk Factors for Long-Term Coronary Artery Calcium Progression in the Multi-Ethnic Study of Atherosclerosis.

    Background: Coronary artery calcium (CAC) detected by noncontrast cardiac computed tomography scanning is a measure of coronary atherosclerosis burden. Increasing CAC levels have been strongly associated with increased coronary events. Prior studies of cardiovascular disease risk factors and CAC progression have been limited by short follow-up or restricted to patients with advanced disease. Methods and results: We examined cardiovascular disease risk factors and CAC progression in a prospective multiethnic cohort study. CAC was measured 1 to 4 times (mean 2.5 scans) over 10 years in 6810 adults without preexisting cardiovascular disease. Mean CAC progression was 23.9 Agatston units/year. An innovative application of mixed-effects models investigated associations between cardiovascular disease risk factors and CAC progression. This approach adjusted for time-varying factors, was flexible with respect to follow-up time and number of observations per participant, and allowed simultaneous control of factors associated with both baseline CAC and CAC progression. Models included age, sex, study site, scanner type, and race/ethnicity. Associations were observed between CAC progression and age (14.2 Agatston units/year per 10 years [95% CI 13.0 to 15.5]), male sex (17.8 Agatston units/year [95% CI 15.3 to 20.3]), hypertension (13.8 Agatston units/year [95% CI 11.2 to 16.5]), diabetes (31.3 Agatston units/year [95% CI 27.4 to 35.3]), and other factors. Conclusions: CAC progression analyzed over 10 years of follow-up, with a novel analytical approach, demonstrated strong relationships with risk factors for incident cardiovascular events. Longitudinal CAC progression analyzed in this framework can be used to evaluate novel cardiovascular risk factors.